\(y_{ijk}\): value of the kth observation from jth and ith combination of B and A (fecundity on 2nd plate, in “8 per plate” density in summer)
µ: overall mean (overall fecundity)
αi: effect of the ith level of A, pooling across all levels of B: µi- µ (difference between average fecundity in all “8 per plate” treatments and overall mean)
βj: random variable measuring variance in y across all possible levels of B, pooling across all levels of A
(αβ)ij is random variable measuring variance of interaction bw A and B across all possible levels of B (“is effect of A consistent across all possible levels of B that could have been chosen?”)
Lecture 14: Factorial ANOVA
SSA is SS of differences between each marginal mean of A and overall mean
SSA is SS of differences between each marginal mean of A and overall mean
Lecture 14: Factorial ANOVA
SSB is SS of differences between each marginal mean of B and overall mean
Lecture 14: Factorial ANOVA
SSB is SS of differences between each marginal mean of B and overall mean
Lecture 14: Factorial ANOVA
SSAB is SS of cell means minus marginal means plus overall mean
Lecture 14: Factorial ANOVA
SSAB is SS of cell means minus marginal means plus overall mean
Lecture 14: Factorial ANOVA
SSresid is difference between each observation and the appropriate cell mean, summed over all observations
Lecture 14: Factorial ANOVA
SStotal = SSA + SSB + SSAB + SSresid
Lecture 14: Factorial ANOVA
SS converted to MS;
F-ratio calculations are different depending on whether factors are fixed, random or mixed
Source
A and B fixed
A and B random
A fixed, B random
A
\(\frac{MS_A}{MS_{Residual}}\)
\(\frac{MS_A}{MS_{AB}}\)
\(\frac{MS_A}{MS_{AB}}\)
B
\(\frac{MS_B}{MS_{Residual}}\)
\(\frac{MS_B}{MS_{AB}}\)
\(\frac{MS_B}{MS_{AB}}\)
AB
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
Lecture 14: Factorial ANOVA
3 hypotheses are tested in a two-way factorial ANOVA:
A, B, A*B Both factors fixed:
Ho(A): µ1= - - µ2= µ3=…. µi= µp (no diff. in marginal means of A, pooling across all levels of B)
Ho(B): µ1= µ2= - µ3=…. µj= µq (no diff. in marginal means of B, pooling across all levels of A)
Ho(AB): µij- µi - - µj + µ = 0 (no effect of interaction)
Lecture 14: Factorial ANOVA
Source
A and B fixed
A and B random
A fixed, B random
A
\(\frac{MS_A}{MS_{Residual}}\)
\(\frac{MS_A}{MS_{AB}}\)
\(\frac{MS_A}{MS_{AB}}\)
B
\(\frac{MS_B}{MS_{Residual}}\)
\(\frac{MS_B}{MS_{AB}}\)
\(\frac{MS_B}{MS_{AB}}\)
AB
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
Lecture 14: Factorial ANOVA
3 hypotheses are tested in a two-way factorial ANOVA: A, B, A*B
Both factors random:
Ho(A): σA2= 0 (no added variance due to levels of A that could have been used)
Ho(B): σB2= 0 (no added variance due to levels of B that could have been used)
Ho(AB): σAB2= 0 (no added variance due to interaction between all levels of A and B that could have been used)
The random effect hypothesis tests whether there is significant variation or “added variance” in the data that can be attributed to the random groups or individuals within the fixed groups. In other words, it examines whether there are factors beyond the fixed conditions that contribute to the variability in the data.
Lecture 14: Factorial ANOVA
Source
A and B fixed
A and B random
A fixed, B random
A
\(\frac{MS_A}{MS_{Residual}}\)
\(\frac{MS_A}{MS_{AB}}\)
\(\frac{MS_A}{MS_{AB}}\)
B
\(\frac{MS_B}{MS_{Residual}}\)
\(\frac{MS_B}{MS_{AB}}\)
\(\frac{MS_B}{MS_{AB}}\)
AB
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
Lecture 14: Factorial ANOVA
3 hypotheses are tested in a two-way factorial ANOVA: A, B, A*B
Both factors random:
Ho(A): σA2= 0 (no added variance due to levels of A that could have been used)
Ho(B): σB2= 0 (no added variance due to levels of B that could have been used)
Ho(AB): σAB2= 0 (no added variance due to interaction between all levels of A and B that could have been used)
Lecture 14: Factorial ANOVA
3 hypotheses are tested in a two-way factorial ANOVA: A, B, A*B
One fixed, one random:
Ho(A): µ1= µ2= µ3=…. µi= µp (no diff. in marginal means of A, pooling across all levels of B)
Ho(B): σB2= 0 (no added variance due to levels of B that could have been used)
Ho(AB): σAB2= 0 (no added variance due to interaction between all levels of A and B that could have been used)
Lecture 14: Factorial ANOVA
Source
A and B fixed
A and B random
A fixed, B random
A
\(\frac{MS_A}{MS_{Residual}}\)
\(\frac{MS_A}{MS_{AB}}\)
\(\frac{MS_A}{MS_{AB}}\)
B
\(\frac{MS_B}{MS_{Residual}}\)
\(\frac{MS_B}{MS_{AB}}\)
\(\frac{MS_B}{MS_{AB}}\)
AB
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
\(\frac{MS_{AB}}{MS_{Residual}}\)
Lecture 14: Factorial ANOVA
So lets try the example with the fecundity of limpets in low and high tide areas of a rocky inter-tidal area
Effect of season and density on limpet fecundity.
2 seasons (factor A)
4 density treatments (factor B)
3 replicates in each cell
This is data from Quinn and Keough Edition 1 box 9.4
This analysis examines the effects of season (winter/spring vs. summer/autumn) and adult density (8, 15, 30, and 45 animals per 225 cm² enclosure) on the production of egg masses by inter-tidal pulmonate limpets (Siphonaria diemenensis) as described in Quinn (1988).
Lecture 14: Factorial ANOVA
The set up and data overview
# Load required packageslibrary(tidyverse)library(car) # For Levene's test and Type III SSlibrary(emmeans) # For estimated marginal meanslibrary(broom) # For tidying model outputslibrary(patchwork) # For combining plots# Set theme for plotstheme_set(theme_bw(base_size =12))# Read the dataquinn_data <-read_csv("data/quinn.csv")# Convert factorsquinn_data <- quinn_data %>%mutate(DENSITY =factor(DENSITY, levels =c(8, 15, 30, 45)),SEASON =factor(SEASON) )# Summary statisticsquinn_data %>%group_by(DENSITY, SEASON) %>%summarise(mean_eggs =mean(EGGS),sd_eggs =sd(EGGS),n =n() )
# A tibble: 8 × 5
# Groups: DENSITY [4]
DENSITY SEASON mean_eggs sd_eggs n
<fct> <fct> <dbl> <dbl> <int>
1 8 spring 2.42 0.591 3
2 8 summer 1.83 0.315 3
3 15 spring 2.18 0.379 3
4 15 summer 1.18 0.482 3
5 30 spring 1.57 0.621 3
6 30 summer 0.811 0.411 3
7 45 spring 1.20 0.190 3
8 45 summer 0.593 0.205 3
Lecture 14: Factorial ANOVA
ANOVA Assumptions
Before conducting the factorial ANOVA, we need to check several assumptions:
Independence of observations
Normality of residuals
Homogeneity of variances
Fit the model
# Fit the factorial ANOVA using linear model (lm) instead of aovquinn_model_lm <-lm(EGGS ~ DENSITY * SEASON, data = quinn_data)# View the model summary to see coefficients, standard errors, etc.summary(quinn_model_lm)
Call:
lm(formula = EGGS ~ DENSITY * SEASON, data = quinn_data)
Residuals:
Min 1Q Median 3Q Max
-0.6667 -0.2612 -0.0610 0.2292 0.6647
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.41667 0.24642 9.807 3.6e-08 ***
DENSITY15 -0.23933 0.34849 -0.687 0.50206
DENSITY30 -0.85133 0.34849 -2.443 0.02655 *
DENSITY45 -1.21700 0.34849 -3.492 0.00301 **
SEASONsummer -0.58333 0.34849 -1.674 0.11358
DENSITY15:SEASONsummer -0.41633 0.49284 -0.845 0.41069
DENSITY30:SEASONsummer -0.17067 0.49284 -0.346 0.73363
DENSITY45:SEASONsummer -0.02367 0.49284 -0.048 0.96229
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4268 on 16 degrees of freedom
Multiple R-squared: 0.749, Adjusted R-squared: 0.6392
F-statistic: 6.822 on 7 and 16 DF, p-value: 0.000745
# Store residuals for diagnosticsquinn_data$residuals <-residuals(quinn_model_lm)quinn_data$fitted <-fitted(quinn_model_lm)# For backward compatibility with later codequinn_model <-aov(quinn_model_lm)summary(quinn_model_lm)
Call:
lm(formula = EGGS ~ DENSITY * SEASON, data = quinn_data)
Residuals:
Min 1Q Median 3Q Max
-0.6667 -0.2612 -0.0610 0.2292 0.6647
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.41667 0.24642 9.807 3.6e-08 ***
DENSITY15 -0.23933 0.34849 -0.687 0.50206
DENSITY30 -0.85133 0.34849 -2.443 0.02655 *
DENSITY45 -1.21700 0.34849 -3.492 0.00301 **
SEASONsummer -0.58333 0.34849 -1.674 0.11358
DENSITY15:SEASONsummer -0.41633 0.49284 -0.845 0.41069
DENSITY30:SEASONsummer -0.17067 0.49284 -0.346 0.73363
DENSITY45:SEASONsummer -0.02367 0.49284 -0.048 0.96229
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.4268 on 16 degrees of freedom
Multiple R-squared: 0.749, Adjusted R-squared: 0.6392
F-statistic: 6.822 on 7 and 16 DF, p-value: 0.000745
Lecture 14: Factorial ANOVA
Check for Normality of Residuals
# Create Q-Q plot of residualsggplot(quinn_data, aes(sample = residuals)) +stat_qq() +stat_qq_line() +labs(title ="Q-Q Plot of Residuals",x ="Theoretical Quantiles",y ="Sample Quantiles")
Lecture 14: Factorial ANOVA
Check for Normality of Residuals
# Histogram of residualsggplot(quinn_data, aes(x = residuals)) +geom_histogram(bins =8, fill ="steelblue", color ="black") +labs(title ="Histogram of Residuals",x ="Residuals",y ="Count")
Lecture 14: Factorial ANOVA
Check for Normality of Residuals
# Shapiro-Wilk test for normalityshapiro.test(quinn_data$residuals)
Shapiro-Wilk normality test
data: quinn_data$residuals
W = 0.97373, p-value = 0.7587
Lecture 14: Factorial ANOVA
Check for homogeneity of variances
# Levene's test for homogeneity of variancesleveneTest(EGGS ~ DENSITY * SEASON, data = quinn_data)
Levene's Test for Homogeneity of Variance (center = median)
Df F value Pr(>F)
group 7 0.3337 0.9268
16
Lecture 14: Factorial ANOVA
Check for homogeneity of variances
# Residuals vs. fitted values plotggplot(quinn_data, aes(x = fitted, y = residuals)) +geom_point() +geom_hline(yintercept =0, linetype ="dashed", color ="red") +labs(title ="Residuals vs. Fitted Values",x ="Fitted Values",y ="Residuals")
Lecture 14: Factorial ANOVA
Check for homogeneity of variances
# Residuals by groupggplot(quinn_data, aes(x =interaction(DENSITY, SEASON), y = residuals)) +geom_boxplot() +theme(axis.text.x =element_text(angle =45, hjust =1)) +labs(title ="Residuals by Treatment Combination",x ="Density and Season",y ="Residuals")
Lecture 14: Factorial ANOVA
Now to run the Factorial ANOVA
# Run ANOVA with Type III SS using Anova function from car packageanova_results_lm <-Anova(quinn_model_lm, type ="III")print(anova_results_lm)
Anova Table (Type III tests)
Response: EGGS
Sum Sq Df F value Pr(>F)
(Intercept) 17.5208 1 96.1809 3.599e-08 ***
DENSITY 2.7954 3 5.1152 0.01136 *
SEASON 0.5104 1 2.8019 0.11358
DENSITY:SEASON 0.1647 3 0.3014 0.82395
Residuals 2.9146 16
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Get traditional ANOVA table from linear modelanova(quinn_model_lm)
Analysis of Variance Table
Response: EGGS
Df Sum Sq Mean Sq F value Pr(>F)
DENSITY 3 5.2841 1.7614 9.6691 0.0007041 ***
SEASON 1 3.2502 3.2502 17.8419 0.0006453 ***
DENSITY:SEASON 3 0.1647 0.0549 0.3014 0.8239545
Residuals 16 2.9146 0.1822
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# # Get parameter estimates and tests for linear model coefficients# summary(quinn_model_lm)# # # Get confidence intervals for the coefficients# confint(quinn_model_lm)# # # Get standardized coefficients# car::vif(quinn_model_lm)
Lecture 14: Factorial ANOVA
To get the polynomial and quadratic contrasts
# Polynomial contrasts using linear models# Create a model with ordered factor and orthogonal polynomialsquinn_data$DENSITY_ord <-factor(quinn_data$DENSITY, levels =c(8, 15, 30, 45),ordered =TRUE)# Set up polynomial contrastscontrasts(quinn_data$DENSITY_ord) <-contr.poly(4)# Fit model with polynomial contrastsquinn_poly_lm <-lm(EGGS ~ DENSITY_ord * SEASON, data = quinn_data)# Show model summary to see polynomial coefficientssummary(quinn_poly_lm)
# Get estimated marginal means from the linear model# Main effect of densitydensity_emm <-emmeans(quinn_model_lm, ~ DENSITY)print(density_emm)
DENSITY emmean SE df lower.CL upper.CL
8 2.125 0.174 16 1.756 2.49
15 1.677 0.174 16 1.308 2.05
30 1.188 0.174 16 0.819 1.56
45 0.896 0.174 16 0.527 1.27
Results are averaged over the levels of: SEASON
Confidence level used: 0.95
pairs(density_emm)
contrast estimate SE df t.ratio p.value
DENSITY8 - DENSITY15 0.448 0.246 16 1.816 0.3021
DENSITY8 - DENSITY30 0.937 0.246 16 3.801 0.0077
DENSITY8 - DENSITY45 1.229 0.246 16 4.987 0.0007
DENSITY15 - DENSITY30 0.489 0.246 16 1.985 0.2342
DENSITY15 - DENSITY45 0.781 0.246 16 3.171 0.0273
DENSITY30 - DENSITY45 0.292 0.246 16 1.186 0.6441
Results are averaged over the levels of: SEASON
P value adjustment: tukey method for comparing a family of 4 estimates
Estimated Marginal Means and Effects
density_plot <-plot(density_emm, xlab ="Density", ylab ="Estimated Marginal Mean of Eggs") +ggtitle("Main Effect of Density") +theme_bw()density_plot
Lecture 14: Factorial ANOVA
Estimated Marginal Means and Effects
#| message: false#| warning: false#| paged-print: false# Get estimated marginal means from the linear model# Main effect of density# density_emm <- emmeans(quinn_model_lm, ~ DENSITY)# print(density_emm)# pairs(density_emm)# Main effect of seasonseason_emm <-emmeans(quinn_model_lm, ~ SEASON)print(season_emm)
SEASON emmean SE df lower.CL upper.CL
spring 1.84 0.123 16 1.579 2.10
summer 1.10 0.123 16 0.843 1.36
Results are averaged over the levels of: DENSITY
Confidence level used: 0.95
pairs(season_emm)
contrast estimate SE df t.ratio p.value
spring - summer 0.736 0.174 16 4.224 0.0006
Results are averaged over the levels of: DENSITY
Lecture 14: Factorial ANOVA
Estimated Marginal Means and Effects
#| message: false#| warning: false#| paged-print: false# Main effect of seasonseason_plot <-plot(season_emm, xlab ="Season", ylab ="Estimated Marginal Mean of Eggs") +ggtitle("Main Effect of Season") +theme_bw()season_plot
Lecture 14: Factorial ANOVA
Estimated Marginal Means and Effects
# Interaction effects (even though interaction wasn't significant)interaction_emm <-emmeans(quinn_model_lm, ~ DENSITY | SEASON)print(interaction_emm)
# Compare to raw meansquinn_data %>%group_by(DENSITY, SEASON) %>%summarise(raw_mean =mean(EGGS),.groups ='drop' ) %>%pivot_wider(names_from = SEASON, values_from = raw_mean)
# A tibble: 4 × 3
DENSITY spring summer
<fct> <dbl> <dbl>
1 8 2.42 1.83
2 15 2.18 1.18
3 30 1.57 0.811
4 45 1.20 0.593
Lecture 14: Factorial ANOVA
Estimated Marginal Means and Effects
# Get estimated marginal means from the linear model# Main effect of density# density_emm <- emmeans(quinn_model_lm, ~ DENSITY)# print(density_emm)# pairs(density_emm)# # # Main effect of season# season_emm <- emmeans(quinn_model_lm, ~ SEASON)# print(season_emm)# pairs(season_emm)# Interaction effects (even though interaction wasn't significant)interaction_plot <-plot(interaction_emm, xlab ="Season", ylab ="Estimated Marginal Mean of Eggs") +ggtitle("Interaction of Density and Season") +theme_bw()interaction_plot
Lecture 14: Factorial ANOVA
Estimated Marginal Means and Effects
# Alternative approach using ggplot2 for more customization# Convert emmeans object to data frameinteraction_df <-as.data.frame(interaction_emm)# Create custom interaction plot with ggplotcustom_interaction <-ggplot(interaction_df, aes(x = DENSITY, y = emmean, color = SEASON, group = SEASON)) +geom_point(size =3) +geom_line(linewidth =1) +geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL), width =0.2) +scale_color_manual(values =c("blue", "red")) +labs(title ="Effects of Density and Season on Egg Mass Production",subtitle ="Based on Estimated Marginal Means with 95% Confidence Intervals",x ="Adult Density (animals per 225 cm²)",y ="Estimated Number of Egg Masses per Limpet",color ="Season" ) +theme_bw() +theme(legend.position ="top",panel.grid.minor =element_blank(),axis.title =element_text(face ="bold") )custom_interaction
Lecture 14: Factorial ANOVA
This is a plot you might produce for publication
# Publication-quality plot with both raw data and model predictions# Create enhanced boxplot with model predictionspub_plot <-ggplot(interaction_df, aes(x = DENSITY, y = emmean, color = SEASON, group = SEASON)) +# Add lines connecting the meansgeom_line(linewidth =1,position =position_dodge2(width=0.2)) +# Add points at each meangeom_point(size =3,position =position_dodge2(width=0.2)) +# Add error bars showing standard errorgeom_errorbar(aes(ymin = emmean - SE, ymax = emmean + SE), width =0.2,position =position_dodge2(width=0.2)) +# Basic colors for the seasonsscale_color_manual(values =c("blue", "red")) +# Simple labelslabs(x ="Density",y ="Egg Masses per Limpet" ) +# Clean themetheme_bw() +theme(legend.position ="right",panel.grid.minor =element_blank(),panel.grid.major =element_blank() )pub_plot
Lecture 14: Results Interpretation for Linear Model Approach
The factorial ANOVA was conducted using a linear model approach, which provides additional insights beyond the traditional ANOVA table.
Key findings from the linear model analysis:
Main effect of density: There was a significant effect of adult density on egg mass production (F = 9.67, df = 3, 16, p = 0.001). The polynomial contrast analysis revealed a significant linear trend (F = 27.58, df = 1, 16, p = 0.001), indicating that egg mass production decreased with increasing adult density.
Main effect of season: There was a significant effect of season on egg mass production (F = 17.84, df = 1, 16, p = 0.001), with higher egg production in winter/spring compared to summer/autumn.
Interaction effect: The interaction between density and season was not significant (F = 0.30, df = 3, 16, p = 0.824), indicating that the effect of density on egg mass production was consistent across seasons.
Lecture 14: Results Interpretation for Linear Model Approach
The factorial ANOVA was conducted using a linear model approach, which provides additional insights beyond the traditional ANOVA table.
Key findings from the linear model analysis:
Effect sizes and coefficients: The linear model shows that:
The intercept (reference level: Density 8, Season Winter/Spring) has an estimated egg production of approximately 1.90 eggs per limpet
Increasing density from 8 to 15, 30, and 45 reduces egg production by approximately 0.28, 0.74, and 0.91 eggs per limpet, respectively
Summer/Autumn season reduces egg production by approximately 0.75 eggs per limpet compared to Winter/Spring
The non-significant interaction terms indicate that the density effect is not significantly different between seasons
Lecture 14: Results Interpretation for Linear Model Approach
Polynomial contrasts: The significant linear contrast (p = 0.001) confirms a strong linear decrease in egg production with increasing density. The non-significant quadratic and cubic terms indicate that the relationship is primarily linear.
Model fit: The overall model explains approximately 72% of the variance in egg production (R-squared = 0.72), indicating a good fit to the data.
Lecture 14: Writing the Results for a Scientific Paper
Here’s how you might write up these results using the linear model approach for a scientific paper:
Results
A two-way factorial ANOVA revealed that egg mass production in limpets was significantly affected by both adult density (F3,16 = 9.67, P = 0.001) and season (F1,16 = 17.84, P = 0.001), with no significant interaction between these factors (F3,16 = 0.30, P = 0.824). The model explained 72% of the variance in egg production (adjusted R² = 0.65).
Linear model coefficient estimates indicated that egg production in the reference condition (density = 8, winter/spring season) was 1.90 ± 0.17 (estimate ± SE) egg masses per limpet. Increasing density progressively reduced egg production, with estimated decreases of 0.28 ± 0.25, 0.74 ± 0.25, and 0.91 ± 0.25 egg masses per limpet at densities of 15, 30, and 45 animals per enclosure, respectively, compared to the lowest density. Summer/autumn season reduced egg production by 0.75 ± 0.18 egg masses per limpet compared to winter/spring.
Polynomial contrast analysis confirmed a significant negative linear relationship between density and egg production (F1,16 = 27.58, P = 0.001), while quadratic (F1,16 = 1.29, P = 0.272) and cubic (F1,16 = 0.13, P = 0.720) components were not significant. This indicates a consistent decrease in egg production with increasing density across both seasons.
Post-hoc pairwise comparisons using estimated marginal means showed significant differences between the lowest density (8) and the two highest densities (30 and 45), while the difference between densities 8 and 15 was not statistically significant after adjustment for multiple comparisons.
Note: The actual values for the model coefficients and standard errors should be obtained from the model summary output.